Virulome analysis of Escherichia coli ST117 from bovine sources identifies similarities and differences with strains isolated from other food animals

Escherichia coli ST117 is a pandemic extraintestinal pathogenic E. coli (ExPEC) causing significant morbidity globally. Poultry are a known reservoir of this pathogen, but the characteristics of ST117 strains from other animal sources have not been adequately investigated. Here we characterize the genomes of 36 ST117 strains recovered primarily from preweaned dairy calves, but also from older postweaned calves and lactating cows, in the context of other bovine-associated strains and strains from poultry, swine, and humans. Results of this study demonstrate that bovine-associated ST117 genomes encode virulence factors (VFs) known to be involved in extraintestinal infections, but also occasionally encode the Shiga toxin, a VF involved in severe gastrointestinal infections and more frequently identified in E. coli from ruminants than from other animals. Bovine-associated ST117 genomes were also more likely to encode afa-VIII (adhesins), pap (P-fimbriae), cdt (cytolethal distending toxin), and stx (Shiga toxins) than were poultry- and swine-associated genomes. All of the ST117 genomes were grouped into seven virulence clusters, with bovine-associated genomes grouping into Clusters 1, 2, 4, and 5, but not 3, 6, or 7. Major differences in the presence of VFs between clusters were also observed. Antimicrobial resistance genes were detected in 112 of 122 (91%) bovine-associated genomes, with 103 of these being multidrug-resistant (MDR). Inclusion of genomes that differed from ST117 by one multi-locus sequence type (MLST) allele identified 31 STs, four of these among the bovine-associated genomes. These non-ST117 genomes clustered with the ST117 genomes, suggesting that they may cause disease similar to that caused by ST117. Results of this study identify cattle as a reservoir of ST117 strains, some of which are highly similar to those isolated from other food animals and some of which have unique bovine-specific characteristics.


Introduction
Escherichia coli is a diverse species of Gram-negative bacteria that are commensal members of the mammalian and avian gastrointestinal tracts and can be frequently isolated from the environment. Most E. coli strains are non-pathogenic, but the acquisition of virulence factors (VFs) can result in the emergence of pathogenic strains that can cause disease in humans and animals [1]. Currently, there are at least 11 pathovars (pathotypes) that result in various diseases caused by strains with specific combinations of VFs and that may be exacerbated by underlying conditions or age of the human host [2]. Treatment of these diseases typically includes antimicrobial therapy, except for infections caused by Shiga-toxigenic E. coli (STEC). However, the acquisition of antimicrobial resistance genes (ARGs) by these strains can render antimicrobial therapy ineffective and make infections difficult to resolve.
Antimicrobial resistance remains a significant human and animal health concern, causing increased morbidity and mortality from infections that are difficult to treat [3]. ARGs are frequently carried on mobile elements such as plasmids that can transfer between strains [4]. Dairy calves and cows and beef cattle are known reservoirs of antimicrobial-resistant bacteria, including MDR E. coli [5-8]. Dairy calves typically shed a greater ratio of resistant to susceptible bacteria than older animals, and the mechanisms for this are not yet elucidated [5,9]. Previous research has suggested that diet is related to this difference in resistance carriage by animal age [10-12]. The iron content of milk is relatively low, and this has been hypothesized to select for strains that encode accessory iron-scavenging genes (siderophores), which are often colocated on plasmids carrying ARGs [11,12].
E. coli ST117 is a globally distributed extraintestinal pathogenic E. coli (ExPEC) strain that primarily causes bladder infections [13,14]. In the United States alone, such infections result in an estimated 7 million medical visits and $1.6 billion in medical expenses on an annual basis [15,16]. Further, sepsis, which is frequently caused by ExPEC strains, has been estimated to cause more than 85,000 deaths annually [17]. Along with being a known ExPEC strain, ST117 is considered an avian pathogenic E. coli (APEC), as it causes colibacillosis, which results in major economic losses for the broiler industry [18,19]. Although ST117 has been well-characterized in human infections and poultry in recent years, strains from other animal sources, such as livestock, particularly dairy and beef cattle, are not well-characterized. The aim of this study was to investigate the genomic characteristics, particularly related to virulence and antimicrobial resistance, of ST117 strains from cows, and to compare these to genomes of the well-characterized poultry, swine, and human ST117 strains.

Materials and methods
Genomes of 36 previously isolated E. coli ST117 strains recovered from dairy calves and cows in the United States were gathered from an in-house database of E. coli genomes (S1 File). Additionally, a subset of publicly available ST117 genomes from bovine, poultry, swine, ovine, mustelid, and human sources were downloaded from the Enterobase database [20]. All available bovine, swine, ovine, mustelid, and human-isolated genomes were selected, but 1000 poultry-associated genomes were randomly selected to minimize computational resources while simultaneously analyzing a large number of genomes. Genomes that differed from ST117 genomes by one allele were simultaneously downloaded to identify any closely related non-ST117 genomes. To identify these genomes, the ST Query was set to 117, and the "Max Number MisMatches" in the Achtman 7 Gene MLST Query was set to 1.
All genomes were interrogated for the presence of virulence factors (VFs), antimicrobial resistance genes (ARGs), plasmid replicons, and serotype identification using the Abricate and STECFinder programs [21-23]. Clusters of similar virulence profiles (VPs) were identified with the Elbow Method [24] to determine the optimal number of clusters using the within-cluster sum of squares (WCSS) in the packages "cluster" and "pracma" in R. Differences in the virulome structures between clusters were determined by a PERMANOVA analysis in R. Differences in the proportions of VFs detected between clusters and between hosts (bovine versus poultry, and bovine versus swine) were determined using a Fisher's exact test with the fisher.test command in the "stats" package in R. To account for false positive significant results (false significance at P < 0.05), a false discovery rate correction was applied to the Fisher's exact test P-values with the p.adjust command in the "stats" package in R. Significance was considered at Padj < 0.05.
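The per-VF host comparison described above can be sketched as follows. The study itself used fisher.test and p.adjust in R; this is a pure-Python stand-in, not the authors' code. It assumes a two-sided Fisher's exact test and assumes the Benjamini-Hochberg procedure as the false discovery rate correction (the method passed to p.adjust is not stated in the text). The VF names and presence counts in the usage lines are hypothetical; only the group sizes (122 bovine, 1000 poultry) come from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are host groups; columns are VF present / VF absent. The P-value is
    the summed probability of every table with the same margins that is no
    more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x "present" genomes in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = 0.0
    for x in range(lo, hi + 1):
        p = prob(x)
        if p <= p_obs * (1 + 1e-7):  # small tolerance for float comparison
            total += p
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts: (bovine present, bovine absent, poultry present, poultry absent)
counts = {"vf_x": (72, 50, 310, 690), "vf_y": (60, 62, 500, 500)}
raw = [fisher_exact_two_sided(*table) for table in counts.values()]
for name, p_adj in zip(counts, benjamini_hochberg(raw)):
    print(name, "enriched or depleted" if p_adj < 0.05 else "not significant")
```

Adjusting across all tested VFs at once, rather than per VF, is what keeps the expected share of false positives among the significant VFs below the chosen threshold.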

Diversity, virulence, and resistance among the bovine-associated strains
In total, the genomes of 1536 E. coli ST117 (including 36 in-house bovine-associated genomes) and closely related strains with defined source metadata (bovine, poultry, swine, mustelid, ovine, and human) were selected from the Enterobase database. Of these, 122 were from bovine sources, 257 were from humans, 1000 were from poultry, 152 were from swine, three were from mustelids, and two were from ovine sources (S2 File).
A large-scale analysis of genes involved in virulence identified between 155 and 207 VFs in the 122 bovine-associated genomes (median = 184, mean = 182.2) (S4 File). The ST11597, ST11520, ST10642, and ST10618 genomes encoded 168, 193, 156, and 176 VFs, respectively. The presence/absence of VFs was variable across the bovine-associated strains, even for those VFs known to be integral to or involved in the ExPEC infection processes (Fig 2). For instance, 57% of these genomes encoded at least one of the afa genes (afimbrial adhesins), 59% encoded at least one pap gene (P fimbriae), and 92% encoded at least one iucABCD-iutA gene (aerobactin synthesis and receptor). The foc (F1C fimbriae), sfa (S fimbriae), kpsMII (group II capsule synthesis), and dra (adhesins) genes were not detected in any of the bovine-associated genomes.
Following the criteria of Johnson et al. [25], 102 genomes, including the ST11520, ST10642, and ST10618 genomes, were identified as potential ExPECs due to the presence of two or more of the following: papA and/or papC, sfa and/or foc, afa and/or dra, kpsMII, and iutA (Fig 2). Of the 20 genomes that did not meet these criteria, 19 were ST117 and one was ST11597. In total, 91 isolates were considered uropathogenic (UPEC) based on the presence of three or more of the following genes: chuA, fyuA, vat, and yfcV [26]. None of the isolates encoded yfcV, including seven genomes that did not meet the ExPEC gene presence criteria and all the genomes that had one allele difference from ST117 (ST11597, ST11520, ST10642, and ST10618). There were 24 genomes that met the ExPEC gene criteria but did not meet the UPEC criteria. All of these were lacking the vat gene.
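The two marker-based rules above can be written as a short sketch. This is an illustration of the stated criteria only, not the authors' pipeline. It assumes each genome has already been reduced to a set of simplified marker names such as "papC" or "afa"; real Abricate output reports database-specific gene identifiers, so the names and the example gene set below are hypothetical.

```python
# Each inner set is one ExPEC marker category; any member counts as a hit.
EXPEC_MARKER_GROUPS = [
    {"papA", "papC"},
    {"sfa", "foc"},
    {"afa", "dra"},
    {"kpsMII"},
    {"iutA"},
]
UPEC_MARKERS = {"chuA", "fyuA", "vat", "yfcV"}


def is_potential_expec(genes):
    """True when at least two of the five marker categories are present."""
    hits = sum(1 for group in EXPEC_MARKER_GROUPS if group & genes)
    return hits >= 2


def is_potential_upec(genes):
    """True when at least three of the four UPEC marker genes are present."""
    return len(UPEC_MARKERS & genes) >= 3


# Hypothetical genome: meets the ExPEC rule but lacks vat and yfcV.
example = {"papC", "iutA", "chuA", "fyuA"}
print(is_potential_expec(example), is_potential_upec(example))
```

Because no genome in the bovine set encoded yfcV, the UPEC rule reduces in practice to requiring chuA, fyuA, and vat together, which is why genomes lacking vat fail it.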
The distribution of other non-ExPEC associated VFs indicated that some of the ST117 isolates were hybrid pathovars in that they encoded VFs that are integral in the pathogenesis of other pathovars (Fig 2). Of the 102 genomes with ExPEC genes, 11 encoded both stx1A and stx2A (Shiga toxin genes) of the STEC pathovar, and 10 of these genomes encoded the UPEC genes vat, fyuA, and chuA (ExPEC/STEC hybrid pathovar). None of the 11 genomes that encoded Shiga toxin genes encoded the eae or tir genes.
There was a moderate dissimilarity between the virulomes of the bovine-associated and poultry-associated genomes as determined by an analysis of similarities test (ANOSIM R = 0.58, P < 0.001), with approximately 13% of the variance in virulome composition attributed to the distinction between the two groups (PERMANOVA, R² = 0.13, F = 168.74, P < 0.001). Further, 58 VFs were significantly enriched in the bovine-associated genome group compared to the poultry-associated genome group (Table 1); enriched VFs included pap (P-fimbriae) genes, afa-VIII (adhesin) genes, cdt-III (cytolethal distending toxin) genes, and stx (Shiga toxin) genes (Fisher's exact test, Padj < 0.05). There were 40 VFs enriched in the poultry-associated genomes compared to the bovine-associated genomes, including iro (salmochelin) genes and sit (iron transport) genes. Similarly, moderate dissimilarities between the virulomes of bovine-associated and swine-associated genomes were identified, with approximately 20% of the variance in virulome composition attributed to the distinction between the two groups (ANOSIM R = 0.38, P < 0.001; PERMANOVA, R² = 0.20, F = 70.161, P < 0.001). There were 54 VFs enriched in bovine isolates, including pap, afa-VIII, cdt-III, and stx genes, and 48 VFs enriched in swine-associated genomes, including iro and sit genes (Fisher's exact test, Padj < 0.05) (Table 2). Comparisons of the virulomes of both poultry-associated and swine-associated strains to bovine-associated strains indicate significant moderate differences between these groups, as demonstrated by both ANOSIM R statistics. Based on the PERMANOVA R² results, a higher percentage of the variation between the groups was explained for the comparison of swine-associated genomes to bovine-associated genomes (20%) than for the comparison of poultry-associated genomes to bovine-associated genomes (13%). However, a comparison of the F-statistics indicates that the statistical significance of the poultry-associated genomes to bovine-associated genomes comparison is higher than that of the swine-associated genomes to bovine-associated genomes comparison.
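One way to read the R² and F values together is through the standard identity that links them in a one-way PERMANOVA: F = (R² / (a - 1)) / ((1 - R²) / (N - a)), where a is the number of groups and N the number of genomes. The sketch below applies that identity; it is not a re-analysis of the data. It assumes that every genome from each host entered the comparison (122 bovine, 1000 poultry, 152 swine) and uses the R² values as rounded in the text, so the results only approximate the reported F-statistics. The function name is illustrative.

```python
def pseudo_f_from_r2(r2, n_total, n_groups=2):
    """Pseudo-F implied by a one-way PERMANOVA R-squared and the sample size."""
    between = r2 / (n_groups - 1)
    within = (1.0 - r2) / (n_total - n_groups)
    return between / within


# Group sizes are taken from the study; R-squared values are the rounded ones.
f_poultry = pseudo_f_from_r2(0.13, n_total=122 + 1000)
f_swine = pseudo_f_from_r2(0.20, n_total=122 + 152)
print(round(f_poultry, 1), round(f_swine, 1))
```

Under these assumptions the identity shows that, for a fixed R², the pseudo-F grows with the number of genomes, so the larger F for the poultry comparison reflects in part the much larger poultry sample rather than a larger effect.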

Virulence profile and cluster identification among all genomes
There were 1204 unique virulence profiles (VPs), labeled as VP1 through VP1204, identified in the genomes (S5 File). There were 150 VPs that were identified in more than one genome, and 22 that were identified in more than five genomes. The most frequently identified VPs were VP334, VP611, and VP821, which were identified in 22, 20, and 17 genomes, respectively. In total, there were 98 VPs among the bovine-associated genomes that were not identified in the genomes from any other source animal, five VPs were shared with poultry-associated genomes, three VPs were shared with swine-associated genomes, and one VP was shared with human-isolated genomes (VP178) (Fig 3). The STEC strains were assigned to VP1-VP16, VP248-VP250, and VP347. Based on the elbow method, the optimal number of clusters of similar VPs was 7 to 9 (there are multiple closely related VPs within a cluster). To minimize the number of clusters with very few genomes, seven clusters (labeled as Cluster 1 through Cluster 7) were selected for downstream analyses (Table 3). Clusters 2, 4, and 5 encompassed the most genomes, with 636, 394, and 385, respectively. Clusters 1, 6, 7, and 3 comprised 56, 54, 7, and 4 genomes, respectively. Bovine-associated isolates belonged to Clusters 1, 2, 4, and 5, with Cluster 5 containing the most bovine-associated isolates (n = 64), followed by Clusters 1 (n = 40), 2 (n = 15), and 4 (n = 3). Cluster 4 was the only cluster composed of isolates from all host animal sources. STEC strains were assigned to Clusters 1, 2, 3, and 4.
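The notion of a virulence profile used above, a distinct VF presence/absence pattern shared by one or more genomes, can be illustrated with a minimal sketch. It assumes each genome is represented by the set of VFs detected in it. The genome identifiers and VF names are hypothetical, and the numbering order is arbitrary here; how the study ordered VP1 through VP1204 is not stated.

```python
from collections import Counter


def assign_virulence_profiles(presence):
    """Label every distinct VF set with a VP identifier.

    presence maps genome id -> set of detected VF names. Returns a dict of
    genome id -> VP label and a Counter of genomes per VP.
    """
    profile_to_label = {}
    labels = {}
    for genome_id in sorted(presence):  # sorted only to make labels repeatable
        profile = frozenset(presence[genome_id])
        if profile not in profile_to_label:
            profile_to_label[profile] = f"VP{len(profile_to_label) + 1}"
        labels[genome_id] = profile_to_label[profile]
    return labels, Counter(labels.values())


# Hypothetical input: two genomes share a profile, one differs by a single VF.
presence = {
    "genome_a": {"chuA", "fyuA", "iutA"},
    "genome_b": {"iutA", "chuA", "fyuA"},
    "genome_c": {"chuA", "fyuA"},
}
labels, counts = assign_virulence_profiles(presence)
shared = [vp for vp, n in counts.items() if n > 1]
print(labels, shared)
```

Because a single VF difference creates a new profile, the number of VPs can approach the number of genomes, which is why the profiles were then grouped into a small number of clusters of similar VPs.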
There were significant differences in virulome structure between these clusters (Fig 4), with Clusters 4 and 5 having the most apparent differences with other clusters, but fewer differences with each other (PERMANOVA, R² = 0.07 to 0.27, F-statistic = 132 to 582, P-value < 0.05). When considering Clusters 4 and 5, 7% to 27% of the total variance in VF presence and absence patterns was attributed to the grouping variable with other clusters, with varying degrees of significance. A differential enrichment of some VFs was also observed between the clusters. Of 380 VFs, 145 were differentially abundant in at least one cluster, eight were differentially abundant in two clusters, 29 were differentially abundant in three clusters, and 27 were differentially abundant in four clusters. Of all of the cluster comparisons, Clusters 1 and 2 had the most differentially enriched VFs between them, followed by Clusters 1 and 4, Clusters

Discussion
E. coli ST117 is an APEC/ExPEC lineage and a major cause of human morbidity worldwide. Poultry is considered a major reservoir of ST117 [13,27,28]; however, strains from other sources, particularly bovine sources, have not been fully investigated. Recent studies have demonstrated that ST117 is isolated from these animals, particularly young dairy and veal calves, suggesting that these animals may be a potential reservoir of E. coli ST117 [11,29,30]. However, far fewer E. coli ST117 strains have been isolated from bovine sources than from poultry sources, so cattle may serve as a minor reservoir of these strains compared to poultry. The prevalence of ST117 in bovine sources in the United States has not been evaluated.
Most of the genomes in this study, regardless of source, encoded ARGs conferring resistance to multiple classes of antibiotics, and can be considered multidrug-resistant (MDR). The majority of the bovine-associated genomes encoded aminoglycoside, β-lactam, tetracycline, and sulfonamide resistance genes. Antimicrobial administration data were not available for the animals from which these isolates were recovered, so associations between treatment and ARG presence in these strains could not be assessed. ST117 strains are frequently antimicrobial-resistant and frequently MDR, suggesting that mechanisms involved in the persistence of the MDR genotype may also be beneficial to this sequence type, or that the maintenance of ARGs does not confer a disadvantage on resistant ST117 strains compared with susceptible ST117 strains. Multidrug-resistant E. coli ST117 can be found in the bovine calf gut, which is an iron-poor environment because of the low iron levels in colostrum and milk [11,12,29]. Previous research has identified a positive association between the presence of iron-scavenging genes and the MDR genotype in E. coli collected from young calves [11,12]. In this study, 81% of the bovine-associated genomes encoded iucABCD-iutA, 97% encoded fyuA, 99% encoded chuA, and 71% encoded sitABCD, all of which are involved in iron transport from the extracellular environment into the E. coli cell. Genomes of isolates from our collection ("ARS-CC" prefix) were predominantly collected from preweaned calves. A similar trend was observed in ST69 isolates from an earlier study, in which the majority of isolates encoded a sitABCD iron transport operon and were predominantly isolated from preweaned calves rather than evenly recovered from preweaned and postweaned calves [31]. Dairy calves are often born anemic and for the first eight weeks of life are predominantly fed a milk/milk-replacer diet, which is low in iron compared to the forages of older postweaned animals [32]. We have previously hypothesized that this low-iron environment selects for strains that encode accessory siderophores, such as those strains with iucABCD-iutA and sitABCD [11,12]. These two iron-scavenging operons are frequently located on an IncFIB plasmid that can simultaneously encode ARGs [33,34]. The seven bovine-associated genomes that do not encode an IncFIB plasmid do not encode these two scavenging systems. The high prevalence of the IncFIB plasmid encoding these scavenging systems in ST117 strains and other ExPEC strains may be a factor in the high prevalence of these strains in dairy calves. Further work is needed to evaluate this in vivo.

Table 3. Composition of the virulence profile (VP) clusters by source (rows) and cluster assignments of the genomes (columns). The first number represents the number of genomes from each source assigned to that cluster, the first number in the parentheses represents the percentage of genomes in that cluster that were isolated from each source, and the second number in the parentheses represents the percentage of total isolates from that source that are assigned to each cluster.

Variation in the virulence profiles was observed among the genomes, and these variants grouped into clusters of similar profiles. The differences are apparent in the frequency of certain VFs, which is higher in some clusters than in others. This is in part due to the mobile nature of some of these VFs, which are frequently encoded on plasmids and genomic islands. These data suggest that there might be some differences in the virulence potentials of these clusters, but the differences would need to be evaluated in vivo. There was considerable overlap of isolates from different sources within the clusters, suggesting that some similar strains may circulate between different hosts. For instance, poultry-associated isolates were recovered from four of seven virulome clusters (Clusters 2, 4, 5, and 6), and bovine-associated isolates were recovered from three of the same clusters (Clusters 2, 4, and 5). However, bovine-associated isolates were the major constituent (71% of isolates) of Cluster 1, which also included human isolates (25%), two swine isolates, and no poultry isolates.
Moderate variations in virulome structures were observed when bovine-associated genomes were compared to poultry- and swine-associated genomes, suggesting that host species may be involved in the selection of some VFs, many of which are mobile. Interestingly, the bovine-associated isolates encoded some VFs that were identified less frequently in non-bovine-associated isolates. Shiga toxin genes (stx) were only identified in bovine-associated isolates and human-recovered isolates, although the health statuses of the latter were not available. While they appear to have no impact on cattle, Shiga toxins can cause severe illness in humans, including hemolytic uremic syndrome (HUS), which develops in approximately 5-10% of people infected with STEC and is sometimes fatal. ExPEC strains typically do not cause infections of the human gut leading to illness. However, the presence of stx indicates that the ST117 hybrid ExPEC/STEC strains may be able to cause both gastrointestinal infections and extraintestinal infections. Cattle are the primary reservoir of STEC [35,36], yet their carriage of hybrid ExPEC/STEC strains has not been fully investigated, and further work should be conducted to better understand the prevalence of these strains in dairy and beef animals, their public health significance, and their potential pathogenesis in humans.
Results of this study demonstrate that cows and calves are potential sources of E. coli ST117 and closely related non-ST117 strains, although poultry is most likely the major food animal reservoir of these strains. It appears that multiple food animals may be potential reservoirs of similar and potentially virulent strains. Our work further suggests that preweaned calves may be a primary bovine reservoir of ST117, ST69, and other STs that are frequent carriers of iron-scavenging genes, many of which are known to cause extraintestinal infections in humans. The presence of Shiga toxin genes in a small portion of the bovine-associated E. coli ST117 strains suggests these animals are potential sources of ExPEC/STEC hybrid strains. The public health significance of such strains should be more deeply evaluated. Infections caused by ExPEC strains can be difficult to treat due to the high prevalence of resistance among these strains. Associations between iron-scavenging genes and ARGs in the bovine calf gut should be further evaluated in an effort to mitigate carriage of MDR ST117 and other virulent MDR E. coli strains by these animals.

Fig 3. Venn diagram showing the number of virulence profiles (VPs) identified among the genomes from each source and those identified in genomes from multiple sources. The number in the parentheses indicates the percentage of total VPs. https://doi.org/10.1371/journal.pone.0296514.g003

Fig 4. Differences in the virulome structures of the clusters as determined by a PERMANOVA analysis. The darkness of the blue squares is proportional to the R² values. The size of the black circle is proportional to the F-statistic. https://doi.org/10.1371/journal.pone.0296514.g004

S4 File. Virulence factors (VFs) of strains utilized in the study. (XLSX)
S5 File. Virulence profiles (VPs) and clusters of strains utilized in the study. (XLSX)
S6 File. Percentage of strains within each cluster encoding each virulence factor. (XLSX)

Table 1. Virulence factors (VFs) enriched in bovine-associated genomes or poultry-associated genomes when the two groups are compared with each other.
Results are from a Fisher's exact test followed by a P-value correction for multiple comparisons. Padj < 0.05 indicates a significant difference in the abundance of VFs when bovine-associated genomes and poultry-associated genomes are compared.